17 research outputs found

    Expert Knowledge-Guided Length-Variant Hierarchical Label Generation for Proposal Classification

    To advance the development of science and technology, research proposals are submitted to open-court competitive programs developed by government agencies (e.g., NSF). Proposal classification is one of the most important tasks for achieving effective and fair review assignments. It aims to classify a proposal into a length-variant sequence of labels. In this paper, we formulate the proposal classification problem as a hierarchical multi-label classification task. Although there are certain prior studies, proposal classification exhibits unique features: 1) the classification result of a proposal lies in a hierarchical discipline structure with different levels of granularity; 2) proposals contain multiple types of documents; 3) domain experts can empirically provide partial labels that can be leveraged to improve task performance. We focus on developing a new deep proposal classification framework that jointly models these three features. In particular, to generate labels sequentially, we leverage previously generated labels to predict the label of the next level; to integrate partial labels from experts, we use the embeddings of these empirical partial labels to initialize the state of the neural networks. Our model can automatically identify the best label-sequence length at which to stop predicting further labels. Finally, we present extensive results to demonstrate that our method can jointly model partial labels, textual information, and semantic dependencies in label sequences, and thus achieve advanced performance. Comment: 10 pages, Accepted as regular paper by ICDM 202
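
The level-by-level generation described in this abstract can be illustrated with a minimal sketch. This is not the paper's model: the `children` hierarchy, the `toy_score` function, the `STOP` label, and the use of expert labels as a fixed prefix are all hypothetical stand-ins (the paper instead uses expert-label embeddings to initialize a neural network state). The sketch only shows how conditioning on earlier labels and a stop decision yield a length-variant label path.

```python
from typing import Callable, Dict, List, Sequence

STOP = "<stop>"  # hypothetical label meaning "stop generating here"


def generate_label_path(
    children: Dict[str, List[str]],
    score: Callable[[Sequence[str], str], float],
    expert_prefix: Sequence[str] = (),
    root: str = "<root>",
    max_depth: int = 4,
) -> List[str]:
    """Greedily extend a label path one hierarchy level at a time.

    `score(prefix, candidate)` stands in for a learned model that rates
    a candidate label given the labels generated so far.
    """
    path = list(expert_prefix)  # expert partial labels seed the path
    while len(path) < max_depth:
        parent = path[-1] if path else root
        # Candidates are the children of the last label, plus STOP.
        candidates = children.get(parent, []) + [STOP]
        best = max(candidates, key=lambda c: score(path, c))
        if best == STOP:
            break  # the path length is decided by the scores
        path.append(best)
    return path


# Toy discipline hierarchy and a fixed score table (both made up).
children = {"<root>": ["A", "B"], "A": ["A01", "A02"], "A01": ["A0101"]}
toy_scores = {"A": 0.9, "B": 0.1, "A01": 0.7, "A02": 0.2, "A0101": 0.3, STOP: 0.5}


def toy_score(prefix: Sequence[str], candidate: str) -> float:
    return toy_scores[candidate]


print(generate_label_path(children, toy_score))
print(generate_label_path(children, toy_score, expert_prefix=["B"]))
```

In this toy run the first call stops after two levels because `STOP` outscores the only third-level candidate, and the second call keeps the expert-provided label and stops immediately because that label has no children.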

    Semi-supervised Domain Adaptation in Graph Transfer Learning

    As a specific case of graph transfer learning, unsupervised domain adaptation on graphs aims to transfer knowledge from label-rich source graphs to unlabeled target graphs. However, graphs with topology and attributes usually have considerable cross-domain disparity, and there are numerous real-world scenarios where only a subset of nodes is labeled in the source graph. This imposes critical challenges on graph transfer learning due to serious domain shifts and label scarcity. To address these challenges, we propose a method named Semi-supervised Graph Domain Adaptation (SGDA). To deal with the domain shift, we add adaptive shift parameters to each of the source nodes, which are trained in an adversarial manner to align the cross-domain distributions of node embeddings, so that the node classifier trained on labeled source nodes can be transferred to the target nodes. Moreover, to address the label scarcity, we propose pseudo-labeling on unlabeled nodes, which improves classification on the target graph by measuring the posterior influence of nodes based on their relative position to the class centroids. Finally, extensive experiments on a range of publicly accessible datasets validate the effectiveness of our proposed SGDA in different experimental settings.

    Kernel-based Substructure Exploration for Next POI Recommendation

    Point-of-Interest (POI) recommendation, which benefits from the proliferation of GPS-enabled devices and location-based social networks (LBSNs), plays an increasingly important role in recommender systems. It aims to help users discover places of interest to visit based on their previous visits and current status. Most existing methods merely leverage recurrent neural networks (RNNs) to explore sequential influences for recommendation. Despite their effectiveness, these methods not only neglect topological geographical influences among POIs, but also fail to model high-order sequential substructures. To tackle these issues, we propose a Kernel-Based Graph Neural Network (KBGNN) for next POI recommendation, which combines the characteristics of both geographical and sequential influences in a collaborative way. KBGNN consists of a geographical module and a sequential module. On the one hand, we construct a geographical graph and leverage a message passing neural network to capture the topological geographical influences. On the other hand, we explore high-order sequential substructures in the user-aware sequential graph using a graph kernel neural network to capture user preferences. Finally, a consistency learning framework is introduced to jointly incorporate geographical and sequential information extracted from the two separate graphs. In this way, the two modules effectively exchange knowledge to mutually enhance each other. Extensive experiments conducted on two real-world LBSN datasets demonstrate the superior performance of our proposed method over state-of-the-art methods. Our code is available at https://github.com/Fang6ang/KBGNN. Comment: Accepted by the IEEE International Conference on Data Mining (ICDM) 202

    Interdisciplinary Fairness in Imbalanced Research Proposal Topic Inference: A Hierarchical Transformer-based Method with Selective Interpolation

    Topic inference for research proposals aims to obtain the most suitable disciplinary division from the discipline system defined by a funding agency. The agency will subsequently find appropriate peer review experts from its database based on this division. Automated topic inference can reduce human errors caused by manual topic filling, bridge the knowledge gap between funding agencies and project applicants, and improve system efficiency. Existing methods focus on modeling this as a hierarchical multi-label classification problem, using generative models to iteratively infer the most appropriate topic information. However, these methods overlook the gap in scale between interdisciplinary research proposals and non-interdisciplinary ones, leading to an unjust phenomenon where the automated inference system categorizes interdisciplinary proposals as non-interdisciplinary, causing unfairness during the expert assignment. How can we address this data imbalance issue under a complex discipline system and hence resolve this unfairness? In this paper, we implement a topic label inference system based on a Transformer encoder-decoder architecture. Furthermore, we use interpolation techniques to create a series of pseudo-interdisciplinary proposals from non-interdisciplinary ones during training, based on non-parametric indicators such as cross-topic probabilities and topic occurrence probabilities. This approach aims to reduce the bias of the system during model training. Finally, we conduct extensive experiments on a real-world dataset to verify the effectiveness of the proposed method. The experimental results demonstrate that our training strategy can significantly mitigate the unfairness generated in the topic inference task. Comment: 19 pages, Under review. arXiv admin note: text overlap with arXiv:2209.1391

    Graph Soft-Contrastive Learning via Neighborhood Ranking

    Graph Contrastive Learning (GCL) has emerged as a promising approach in the realm of graph self-supervised learning. Prevailing GCL methods mainly derive from the principles of contrastive learning in the field of computer vision: modeling invariance by specifying absolutely similar pairs. However, when applied to graph data, this paradigm encounters two significant limitations: (1) the validity of the generated views cannot be guaranteed: graph perturbation may produce invalid views that conflict with the semantics and intrinsic topology of graph data; (2) specifying absolutely similar pairs in the graph views is unreliable: for abstract and non-Euclidean graph data, it is difficult for humans to decide the absolute similarity and dissimilarity intuitively. Despite the notable performance of current GCL methods, these challenges necessitate a reevaluation: Could GCL be more effectively tailored to the intrinsic properties of graphs, rather than merely adopting principles from computer vision? In response to this question, we propose a novel paradigm, Graph Soft-Contrastive Learning (GSCL). This approach facilitates GCL via neighborhood ranking, avoiding the need to specify absolutely similar pairs. GSCL leverages the underlying graph characteristic of diminishing label consistency, asserting that nodes that are closer in the graph are overall more similar than far-distant nodes. Within the GSCL framework, we introduce pairwise and listwise gated ranking InfoNCE loss functions to effectively preserve the relative similarity ranking within neighborhoods. Moreover, as the neighborhood size expands exponentially with more hops considered, we propose neighborhood sampling strategies to improve learning efficiency. Our extensive empirical results across 11 commonly used graph datasets (8 homophily graphs and 3 heterophily graphs) demonstrate GSCL's superior performance compared to 20 SOTA GCL methods.
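
The neighborhood-ranking idea in this abstract (closer nodes should be more similar to an anchor than farther nodes) can be illustrated with a small sketch. It deliberately swaps in a generic pairwise margin ranking penalty rather than the paper's gated ranking InfoNCE losses, whose exact form the abstract does not give. The function names, the `margin` parameter, and the toy graph and embeddings are all assumptions made for illustration.

```python
from collections import deque
from math import sqrt


def hop_distances(adj, source):
    """Breadth-first search: hop count from `source` to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))


def pairwise_ranking_loss(emb, adj, anchor, margin=0.1):
    """Average hinge penalty over (near, far) node pairs around `anchor`.

    A pair is penalized when the nearer node is not at least `margin`
    more similar to the anchor than the farther node.
    """
    dist = hop_distances(adj, anchor)
    others = [n for n in dist if n != anchor]
    loss, pairs = 0.0, 0
    for near in others:
        for far in others:
            if dist[near] < dist[far]:
                s_near = cosine(emb[anchor], emb[near])
                s_far = cosine(emb[anchor], emb[far])
                loss += max(0.0, margin - (s_near - s_far))
                pairs += 1
    return loss / pairs if pairs else 0.0


# Toy path graph 0 - 1 - 2 with two hypothetical embedding tables.
adj = {0: [1], 1: [0, 2], 2: [1]}
good = {0: [1.0, 0.0], 1: [0.9, 0.1], 2: [0.0, 1.0]}  # 1-hop node is closest
bad = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [0.9, 0.1]}   # ranking is inverted

print(pairwise_ranking_loss(good, adj, anchor=0))
print(pairwise_ranking_loss(bad, adj, anchor=0))
```

The embeddings that respect the hop ordering incur no penalty, while the inverted ones do. No pair is ever declared absolutely similar or dissimilar; only relative order matters, which is the point the abstract makes.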

    T-cell infiltration in the central nervous system and their association with brain calcification in Slc20a2-deficient mice

    Primary familial brain calcification (PFBC) is a rare neurodegenerative and neuropsychiatric disorder characterized by bilateral symmetric intracranial calcification along the microvessels or inside neuronal cells in the basal ganglia, thalamus, and cerebellum. Slc20a2 homozygous (HO) knockout mice are the most commonly used model to simulate the brain calcification phenotype observed in human patients. However, the cellular and molecular mechanisms related to brain calcification, particularly at the early stage well before the emergence of brain calcification, remain largely unknown. In this study, we quantified the central nervous system (CNS)-infiltrating T-cells in different age groups of Slc20a2-HO and matched wild-type mice and found CD45+CD3+ T-cells to be significantly increased in the brain parenchyma, even in the pre-calcification stage of 1-month-old Slc20a2-HO mice. The accumulation of the CD3+ T-cells appeared to be associated with the severity of brain calcification. Further immunophenotyping revealed that the two main subtypes that had increased in the brain were CD3+ CD4− CD8− and CD3+ CD4+ T-cells. The expression of endothelial cell (EC) adhesion molecules increased, while that of tight and adherens junction proteins decreased, providing the molecular precondition for T-cell recruitment to ECs and paracellular migration into the brain. The fusion of lymphocytes and EC membranes and transcellular migration of CD3-related gold particles were captured, suggesting enhancement of transcytosis in the brain ECs. Exogenous fluorescent tracers and endogenous IgG and albumin leakage also revealed an impairment of the transcellular pathway in the ECs. FTY720 significantly alleviated brain calcification, probably by reducing T-cell infiltration, modulating neuroinflammation and the ossification process, and enhancing the autophagy and phagocytosis of CNS-resident immune cells. This study clearly demonstrated CNS-infiltrating T-cells to be associated with the progression of brain calcification. Impairment of blood–brain barrier (BBB) permeability, which was closely related to T-cell invasion into the CNS, could be explained by BBB alterations, namely an increase in the paracellular and transcellular pathways of brain ECs. FTY720 was found to be a potential drug to protect patients from PFBC-related lesions in the future.

    A Comprehensive Survey on Deep Graph Representation Learning

    Graph representation learning aims to effectively encode high-dimensional sparse graph-structured data into low-dimensional dense vectors, which is a fundamental task that has been widely studied in a range of fields, including machine learning and data mining. Classic graph embedding methods follow the basic idea that the embedding vectors of interconnected nodes in the graph should remain relatively close to each other, thereby preserving the structural information between the nodes in the graph. However, this is sub-optimal because: (i) traditional methods have limited model capacity, which limits learning performance; (ii) existing techniques typically rely on unsupervised learning strategies and fail to couple with the latest learning paradigms; (iii) representation learning and downstream tasks depend on each other and should be jointly enhanced. With the remarkable success of deep learning, deep graph representation learning has shown great potential and advantages over shallow (traditional) methods, and a large number of deep graph representation learning techniques, especially graph neural networks, have been proposed in the past decade. In this survey, we comprehensively review current deep graph representation learning algorithms by proposing a new taxonomy of existing state-of-the-art literature. Specifically, we systematically summarize the essential components of graph representation learning and categorize existing approaches according to graph neural network architectures and the most recent advanced learning paradigms. Moreover, this survey also presents practical and promising applications of deep graph representation learning. Last but not least, we state new perspectives and suggest challenging directions that deserve further investigation in the future.